Using Standardized Lexical Semantic Knowledge to Measure Similarity

Authors

  • Wafa Wali
  • Bilel Gargouri
  • Abdelmajid Ben Hamadou
Abstract

Sentence semantic similarity is essential to many applications of Natural Language Processing. It has been addressed in several frameworks dealing with the similarity between short texts, and in particular between sentence pairs. However, the semantic component of the proposed methods has been paradoxically weak. To address this weakness, we propose in this paper a new method for estimating semantic sentence similarity based on the Lexical Markup Framework (LMF) ISO-24613 standard. LMF provides a fine-grained structure and incorporates an abundance of interconnected lexical knowledge, notably sense knowledge such as semantic predicates, semantic classes, thematic roles and various sense relations. The method proved effective in applications carried out on the Arabic language, chosen mainly because an Arabic dictionary conforming to the LMF standard is available within our research team. Experiments on a set of selected sentence pairs show that the proposed method yields a similarity measure that agrees with human intuition.

Similar resources

Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...

A Structural-Lexical Measure of Semantic Similarity for Geo-Knowledge Graphs

Graphs have become ubiquitous structures to encode geographic knowledge online. The Semantic Web’s linked open data, folksonomies, wiki websites and open gazetteers can be seen as geo-knowledge graphs, that is, labeled graphs whose vertices represent geographic concepts and whose edges encode the relations between concepts. To compute the semantic similarity of concepts in such structures, this ...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Valuing Semantic Similarity

Similarity is a tool widely used in various domains such as DNA sequence analysis, knowledge representation, natural language processing, data mining, information retrieval, and information flow. Computing semantic similarity between two entities is a non-trivial task. There are many ways to define semantic similarity. Some measures have been proposed combining both statistical information and ...

An improved semantic similarity measure for document clustering based on topic maps

A major computational burden in document clustering is the calculation of the similarity measure between a pair of documents. A similarity measure is a function that assigns a real number between 0 and 1 to a pair of documents, depending on the degree of similarity between them. A value of zero means that the documents are completely dissimilar whereas a value of one indicates that ...

Publication date: 2014